ABSTRACT
We study the problem of scheduling tasks in a distributed system where the data (and code) for a program may
reside on a processor different from the one where it will be executed. Scheduling in this setting is more
complex than in classical scheduling problems, as one must take into account not only the processing times but
also the communication times. We present an off-line polynomial-time approximation algorithm for the case when the
processors can be partitioned into storage (client) and processing (server) nodes. Our algorithm is the first
constant-ratio approximation algorithm for this problem. We then discuss generalizations of our problem,
including an on-line distributed version, as well as versions that allow tasks to access multiple input files and
generate multiple output files that reside on one or more nodes.
Keywords: Approximation Algorithms, Dual Objective Functions, Minimize Makespan, Scheduling.